- Specification for RTF
-
- ---------------------
-
- RTF text is a form of encoding of various text
-
- formatting properties, document structures,
-
- and document properties, using the printable
-
- ASCII character set. Special characters can also be
-
- thus encoded, although RTF does not prevent the utilization
-
- of character codes outside the ASCII printable set.
-
- The main encoding mechanism of "control words" provides a name
-
- space that may be later used to expand the realm of RTF with
-
- macros, programming, etc.
-
- 1. BASIC INGREDIENTS
-
- Control words are of the form:
-
- \lettersequence <delimiter> where <delimiter> is:
-
- . a space: the space is part of the control word.
-
- . a digit or - means that a parameter follows. The following
- digit
-
- sequence is then delimited by a space or any other
-
- non-letter-or-digit as for control words.
-
- . any other non-letter-or-digit: terminates the control word,
- but is not
-
- a part of the control word.
-
- By "letter", here we mean just the upper and lower case ASCII
- letters.
-
- Control symbols consist of a \ character followed by a single
-
- non-letter. They require no further delimiting.
-
- Notes: control symbols are compact, but there are not too many
-
- of them. The number of possible control words is not limited.
-
- The parameter is partially incorporated in control words, so that
-
- a program that does not understand a control word can recognize
-
- and ignore the corresponding parameter as well.
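The delimiter rules above can be sketched in Python. This is a minimal illustration, not a reference implementation; the function name `read_control` and its return shape are assumptions made for this example.

```python
import string

LETTERS = set(string.ascii_letters)  # "letter" means ASCII upper/lower case only
DIGITS = set(string.digits)


def read_control(text, pos):
    """Parse one control word or control symbol starting at the backslash text[pos].

    Returns (name, param, next_pos); param is None when no parameter is present.
    """
    if text[pos] != "\\":
        raise ValueError("expected a backslash")
    pos += 1
    if pos >= len(text):
        raise ValueError("backslash at end of input")
    # Control symbol: a backslash plus a single non-letter, no delimiter needed.
    if text[pos] not in LETTERS:
        return text[pos], None, pos + 1
    # Control word: collect the letter sequence.
    start = pos
    while pos < len(text) and text[pos] in LETTERS:
        pos += 1
    name = text[start:pos]
    # A digit or '-' means that a parameter follows.
    param = None
    if pos < len(text) and (text[pos] in DIGITS or text[pos] == "-"):
        num_start = pos
        pos += 1
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        number = text[num_start:pos]
        if number == "-":
            raise ValueError("'-' must be followed by digits")
        param = int(number)
    # A space delimiter belongs to the control word and is consumed;
    # any other delimiter is left in place for the caller.
    if pos < len(text) and text[pos] == " ":
        pos += 1
    return name, param, pos
```

Because the parameter syntax is fixed, a reader can call this on a control word it does not know and still skip the word and its parameter cleanly.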
-
- In addition to control words and control symbols, there are also
- the braces:
-
- { group start, and
-
- } group end. The text grouping will be used for formatting
-
- and to delineate document structure - such as the footnotes,
- headers,
-
- title, and so on. The control words, control symbols, and braces
-
- constitute control information. All other characters in RTF text
-
- constitute "plain text".
-
- Since the characters \, {, and } have specific uses in RTF,
-
- the control symbols \\,\{, and \} are provided to express
-
- the corresponding plain characters.
-
- 2. WHAT RTF TEXT MEANS (SEMANTICS)
-
- The reader of an RTF stream will be concerned with:
-
- Separating control information from plain text.
-
- Acting on control information. This is designed to be
-
- a relatively simple process, as described below.
-
- Some control information just contributes special
-
- characters to the plain text stream. Other information
-
- serves to change the "program state" which includes
-
- properties of the document as a whole and also a stack
-
- of "group states" that apply to parts.
-
- Note that the group state is saved by the { brace and is
-
- restored by the } brace. The current group state specifies:
-
- 1. the "destination" or part of the document that the
-
- plain text is building up.
-
- 2. the character formatting properties - such as bold or
-
- italic.
-
- 3. the paragraph formatting properties - such as justified.
-
- 4. the section formatting properties - such as number of columns.
-
- Collecting and properly disposing of the remaining "plain text"
-
- as directed by the current group state.
-
- In practice the RTF reader will proceed as follows:
-
- 0. read next char
-
- 1. if ={
-
- stack current state. current state does not change.
-
- continue.
-
- 2. if =}
-
- unstack current state from stack. this will change the
-
- state in general.
-
- 3. if =\
-
- collect control word/control symbol and parameter, if any.
-
- look up word/symbol in symbol table (a constant table)
-
- and act according to the description there. The different
-
- actions are listed below. Parameter is left available for use by
- the action.
-
- Leave read pointer before or after the delimiter, as appropriate.
-
- After the action, continue.
-
- 4. otherwise, write "plain text" character to current destination
-
- using current formatting properties.
-
- Given a symbol table entry, the possible actions are as follows:
-
- A. Change destination:
-
- change destination to the destination described in the entry.
-
- Most destination changes are legal only immediately after a {.
-
- Other restrictions may also apply (for example, footnotes
-
- may not be nested.)
-
- B. Change formatting property:
-
- The symbol table entry will describe the property and
-
- whether the parameter is required.
-
- C. Special character:
-
- The symbol table entry will describe the character code.
-
- goto 4.
-
- D. End of paragraph
-
- This could be viewed as just a special character.
-
- E. End of section
-
- This could be viewed as just a special character.
-
- F. Ignore
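The reader steps and actions above can be sketched as a small Python loop. This is an illustration under stated assumptions: the symbol table `SYMBOLS` holds only a handful of hypothetical entries, the names `read_rtf`, `emit`, and `plain` are invented for this example, and restrictions such as "legal only immediately after a {" are not checked. Unknown control words fall through to the Ignore action.

```python
import copy
import re

# Hypothetical constant symbol table: name -> (action kind, payload).
SYMBOLS = {
    "rtf": ("dest", "document"),
    "footnote": ("dest", "footnote"),
    "b": ("prop", "bold"),
    "i": ("prop", "italic"),
    "tab": ("char", "\t"),
    "par": ("char", "\n"),   # end of paragraph viewed as a special character
    "\\": ("char", "\\"),
    "{": ("char", "{"),
    "}": ("char", "}"),
}

# Control word (letters, optional signed number, optional space) or control symbol.
CONTROL = re.compile(r"\\(?:([A-Za-z]+)(-?[0-9]+)? ?|([^A-Za-z]))")


def emit(out, state, ch):
    """Write ch to the current destination with the current properties."""
    out.setdefault(state["dest"], []).append((ch, frozenset(state["props"])))


def plain(out, dest):
    """Join the characters collected for one destination."""
    return "".join(ch for ch, _ in out.get(dest, []))


def read_rtf(text):
    """Return {destination: [(char, frozenset of property names), ...]}."""
    out = {}
    state = {"dest": "document", "props": {}}
    stack = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch == "{":                       # step 1: stack current state
            stack.append(copy.deepcopy(state))
            pos += 1
        elif ch == "}":                     # step 2: unstack state
            state = stack.pop()
            pos += 1
        elif ch == "\\":                    # step 3: control word or symbol
            m = CONTROL.match(text, pos)
            if m is None:
                raise ValueError("backslash at end of input")
            name = m.group(1) or m.group(3)
            param = int(m.group(2)) if m.group(2) else None
            pos = m.end()
            kind, payload = SYMBOLS.get(name, ("ignore", None))
            if kind == "dest":              # A: new destination, default properties
                state = {"dest": payload, "props": {}}
            elif kind == "prop":            # B: change formatting property
                state["props"][payload] = True if param is None else param
            elif kind == "char":            # C, D, E: special character
                emit(out, state, payload)
        else:                               # step 4: plain text
            emit(out, state, ch)
            pos += 1
    return out
```

For example, `plain(read_rtf(r"{\rtf1 Hello {\b bold} world\par}"), "document")` yields the text with the bold property attached only to the characters inside the inner group, since the closing brace restores the saved state.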
-
- 3. SPECIAL CHARACTERS
-
- The special characters are explained as they exist in Mac Word.
-
- Clearly, other characters may be added for interchange with other
-
- programs. If a character name is not recognized by a reader,
- according
-
- to the rules described above, it will be simply ignored.
-
- \chpgn current page number (as in headers)
-
- \chftn auto numbered footnote reference
-
- (footnote to follow in a group)
-
- \chpict placeholder character for picture
-
- (picture to follow in a group)
-
- \chdate current date (as in headers)
-
- \chtime current time (as in headers)
-
- \| formula character
-
- \~ non-breaking space
-
- \- non-required hyphen
-
- \_ non-breaking hyphen
-
- \page required page break
-
- \line required line break (no paragraph break)
-
- \par end of paragraph.
-
- \sect end of section and end of paragraph.
-
- \tab same as ASCII 9
-
- For simplicity of operation, the ASCII codes 9 and 10 will be
- accepted
-
- as \tab and \par respectively. ASCII 13 will be ignored. The
- control
-
- code \<10> will be ignored. It may be used to include "soft"
- carriage
-
- returns for easier readability but which will have no effect
-
- on the interpretation.
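One way to apply these raw-character rules is a pre-pass that rewrites the input before the main reader sees it. The sketch below is an assumption about how a reader might be organized, not something the specification prescribes; the name `normalize_raw` is hypothetical.

```python
def normalize_raw(text):
    """Rewrite raw ASCII 9, 10, 13 and the backslash-linefeed control code."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            if text[i + 1] != "\n":
                # Keep any other backslash pair intact (for example \\ or \{).
                out.append(text[i:i + 2])
            # Backslash followed by ASCII 10 is a "soft" return: dropped.
            i += 2
            continue
        if ch == "\t":
            out.append("\\tab ")    # ASCII 9 is accepted as \tab
        elif ch == "\n":
            out.append("\\par ")    # ASCII 10 is accepted as \par
        elif ch == "\r":
            pass                    # ASCII 13 is ignored
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

The trailing space written after `\tab` and `\par` is the control word delimiter, so it does not add a character to the plain text.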
-
- 4. DESTINATIONS
-
- The change of destination will reset all properties to default.
- Changes
-
- are legal only at the beginning of a group (by group here we
- mean the
-
- text and controls enclosed in braces.)
-
- \rtf<param>
-
- The destination is the document. The parameter is the
-
- version number of the writer. This destination, preceded
-
- by {, marks the beginning of RTF documents, and the corresponding }
-
- marks the end.
-
-
-
- Legal only once after the initial {. Small scale interchange of
- RTF where
-
- other methods for marking the end of string are available, as in
- a
-
- string constant, need not include this identification but
-
- will start with this destination as the default.
-
- \pict
-
- The destination is a picture. The group must immediately
-
- follow a \chpict character. The plain text describes
-
- the picture as a hex dump (string of characters 0,1,...
-
- 9, a, ..., e, f.)
-
- (Formatting properties to determine data interpretation, size)
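As a minimal sketch, the hex dump collected as plain text in a \pict group could be turned back into bytes as follows. The function name `decode_pict_hex` is hypothetical, and dropping whitespace between hex digits is an assumption of this example rather than a rule stated here.

```python
def decode_pict_hex(plain_text):
    """Convert the hex-dump plain text of a picture group into bytes."""
    # Assumption: whitespace between hex digits carries no meaning.
    digits = "".join(plain_text.split())
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    # bytes.fromhex raises ValueError on any character outside 0-9, a-f, A-F.
    return bytes.fromhex(digits)
```

How the resulting bytes are interpreted depends on the picture formatting properties, which this sketch does not cover.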
-
- \footnote
-
- The destination is a footnote text. The group must
-
- immediately follow the footnote reference character(s).
-
- \header
-
- The destination is the header text for the current section.
-
- The group must precede the first plain text character
-
- in the section.
-
- \headerl
-
- Same as above, but header for left-hand pages.
-
- \headerr
-
- Same as above, but header for right-hand pages.
-
- \headerf
-
- Same as above, but header for first page.
-
- \footer
-
- Same as above, but footer.
-
- \footerl
-
- Same as above, but footer for left-hand pages.
-
- \footerr
-
- Same as above, but footer for right-hand pages.
-
- \footerf
-
- Same as above, but footer for first page.
-
- \ftnsep
-
- Same as above, but text is footnote separator
-
- \ftnsepc
-
- Same as above, but text is separator for continued footnotes.
-
- \ftncn
-
- Same as above, but text is continued footnote notice.
-
- \info
-
- text is information block for the document. Parts of the
-
- text are further classified by "properties" of the text
-
- that are listed below - such as "title". These are not
-
- formatting properties, but a device to delimit and identify
-
- parts of the info from the text in the group.
-
- \stylesheet
-
- text is the style sheet for the document.
-
- More precisely, each piece of text between semicolons is taken to be
-
- a style name, which will be defined to stand for the
-
- formatting properties which are in effect.
-
- \fonttbl
-
- font table. See below.
-
- \colortbl
-
- color table. See below.
-
- \comment
-
- text will be ignored.
-
- 5. DOCUMENT FORMATTING PROPERTIES
-
- (000 stands for a number which may be signed)
-
- \paperw000 paper width in twips 12240
-
- \paperh000 paper height 15840
-
- \margl000 left margin 1800
-
- \margr000 right margin 1800
-
- \margt000 top margin 1440
-
- \margb000 bottom margin 1440
-
- \facingp facing pages
-
- \gutter000 gutter width
-
- \deftab000 default tab width 720
-
- \widowctrl enable widow control
-
- \endnotes footnotes at end of section
-
- \ftnbj footnotes at bottom of page default
-
- \ftntj footnotes beneath text (top just)
-
- \ftnstart000 starting footnote number 1
-
- \ftnrestart restart footnote numbers each page
-
- \pgnstart000 starting page number 1
-
- \linestart000 starting line number 1
-
- \landscape printed in landscape format
-
- (the "next file" property will be encoded in the info text )
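Measurements in the table above are in twips. A twip is 1/20 of a point, so there are 1440 twips per inch. The short sketch below converts the values listed above, assuming the trailing numbers in the table are defaults; the names `twips_to_inches` and `DEFAULTS` are hypothetical.

```python
TWIPS_PER_INCH = 1440  # 20 twips per point, 72 points per inch


def twips_to_inches(twips):
    """Convert a measurement in twips to inches."""
    return twips / TWIPS_PER_INCH


# Values copied from the table above, assumed to be defaults.
DEFAULTS = {
    "paperw": 12240, "paperh": 15840,
    "margl": 1800, "margr": 1800,
    "margt": 1440, "margb": 1440,
}

# Width left for text once both side margins are subtracted, in twips.
text_width = DEFAULTS["paperw"] - DEFAULTS["margl"] - DEFAULTS["margr"]
```

With these values the paper is 8.5 by 11 inches and the text column is 6 inches wide.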
-
- 6. SECTION FORMATTING PROPERTIES
-
- \sectd reset to default section properties
-
- \nobreak break code
-
- \colbreak break code default
-
- \pagebreak break code
-
- \evenbreak break code
-
- \oddbreak break code
-
- \pgnrestart restart page numbers at 1
-
- \pgndec page number format decimal default
-
- \pgnucrm page number format uc roman
-
- \pgnlcrm page number format lc roman
-
- \pgnucltr page number format uc letter
-
- \pgnlcltr page number format lc letter
-
- \pgnx000 auto page number x pos 720
-
- \pgny000 auto page number y pos 720
-
- \linemod000 line number modulus
-
- \linex000 line number - text distance 360
-
- \linerestart line number restart at 1 default
-
- \lineppage line number restart on each page
-
- \linecont line number continued from prev section
-
- \headery000 header y position from top of page 720
-
- \footery000 footer y position from bottom of page 720
-
- \cols000 number of columns 1
-
- \colsx000 space between columns 720
-
- \endnhere include endnotes in this section
-
- \titlepg title page is special
-
- 7. PARAGRAPH FORMATTING PROPERTIES
-
- \pard reset to default para properties.
-
- \s000 style
-
- \ql quad left default
-
- \qr quad right
-
- \qj justified
-
- \qc centered
-
- \fi000 first line indent
-
- \li000 left indent
-
- \ri000 right indent
-
- \sb000 space before
-
- \sa000 space after
-
- \sl000 space between lines
-
- \keep keep
-
- \keepn keep with next para
-
- \sbys side by side
-
- \pagebb page break before
-
- \noline no line numbering
-
- \brdrt border top
-
- \brdrb border bottom
-
- \brdrl border left
-
- \brdrr border right
-
- \box border all around
-
- \brdrs single thickness
-
- \brdrth thick
-
- \brdrsh shadow
-
- \brdrdb double
-
- \tx000 tab position
-
- \tqr right flush tab (these apply to last specified
- pos)
-
- \tqc centered tab
-
- \tqdec decimal aligned tab
-
- \tldot leader dots
-
- \tlhyph leader hyphens
-
- \tlul leader underscore
-
- \tlth leader thick line
-
- 8. CHARACTER FORMATTING PROPERTIES
-
- \plain reset to default text properties.
-
- \b bold
-
- \i italic
-
- \strike strikethrough
-
- \outl outline
-
- \shad shadow
-
- \scaps small caps
-
- \caps all caps
-
- \v invisible text
-
- \f000 font number n
-
- \fs000 font size in half points 24
-
- \ul underline
-
- \ulw word underline
-
- \uld dotted underline
-
- \uldb double underline
-
- \up000 superscript in half points
-
- \dn000 subscript in half points
-
- 9. INFO GROUP
-
- The plain text in the group is used to specify the various
- fields of the information block. The current field may be
- thought of as a particular setting of the "sub-destination"
- property of the text.
-
- \title following plain text is the title
-
- \subject following text is the subject
-
- \operator
-
- \author
-
- \keywords
-
- \doccomm comments (not to be confused with \comment )
-
- \version
-
- \nextfile following text is name of "next" file
-
- The other properties assign their parameters directly to the
- info block.
-
- \verno000 internal version number
-
- \creatim creation time follows
-
- \yr000 year to be assigned to previously specified time
- field
-
- \mo000
-
- \dy000
-
- \hr000
-
- \min000
-
- \sec000
-
- \revtim revision time follows
-
- \printtim print time follows
-
- \buptim backup time follows
-
- \edmins000 editing minutes
-
- \nofpages000
-
- \nofwords000
-
- \nofchars000
-
- \id000 internal ID number
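As an illustration of the time fields, the sketch below assembles the \yr, \mo, \dy, \hr, \min and \sec parameters that follow one time keyword (such as \creatim) into a Python datetime. The function name `parse_time`, the sample input, and the fallback values for missing fields are assumptions of this example.

```python
import re
from datetime import datetime

# Matches one time field control word together with its numeric parameter.
FIELD = re.compile(r"\\(yr|mo|dy|hr|min|sec)(-?[0-9]+)")


def parse_time(fragment):
    """Build a datetime from the time fields found in fragment."""
    # Assumed fallbacks for fields that are absent from the fragment.
    values = {"yr": 1, "mo": 1, "dy": 1, "hr": 0, "min": 0, "sec": 0}
    for name, number in FIELD.findall(fragment):
        values[name] = int(number)
    return datetime(values["yr"], values["mo"], values["dy"],
                    values["hr"], values["min"], values["sec"])
```

A call such as `parse_time(r"\creatim\yr1987\mo3\dy15\hr10\min30")` collects each field in turn; out-of-range values raise ValueError from the datetime constructor.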
-
-
-
-